{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Logging traces with Opik\n",
    "\n",
    "For this guide we will be downloading the essays from Paul Graham and use them as our data source. We will then start querying these essays with LlamaIndex and logging the traces to Opik.\n",
    "\n",
    "## Creating an account on Comet.com\n",
    "\n",
    "[Comet](https://www.comet.com/site) provides a hosted version of the Opik platform, [simply create an account](https://www.comet.com/signup?from=llm) and grab you API Key.\n",
    "\n",
    "> You can also run the Opik platform locally, see the [installation guide](https://www.comet.com/docs/opik/self-host/self_hosting_opik/) for more information."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "import getpass\n",
    "\n",
    "if \"OPIK_API_KEY\" not in os.environ:\n",
    "    os.environ[\"OPIK_API_KEY\"] = getpass.getpass(\"Opik API Key: \")\n",
    "if \"OPIK_WORKSPACE\" not in os.environ:\n",
    "    os.environ[\"OPIK_WORKSPACE\"] = input(\n",
    "        \"Comet workspace (often the same as your username): \"\n",
    "    )"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If you are running the Opik platform locally, simply set:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# import os\n",
    "# os.environ[\"OPIK_URL_OVERRIDE\"] = \"http://localhost:5173/api\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Preparing our environment\n",
    "\n",
    "First, we will install the necessary libraries, download the Chinook database and set up our different API keys."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%pip install opik llama-index llama-index-agent-openai llama-index-llms-openai --upgrade --quiet"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "And configure the required environment variables:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "import getpass\n",
    "\n",
    "if \"OPENAI_API_KEY\" not in os.environ:\n",
    "    os.environ[\"OPENAI_API_KEY\"] = getpass.getpass(\n",
    "        \"Enter your OpenAI API key: \"\n",
    "    )"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In addition, we will download the Paul Graham essays:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "import requests\n",
    "\n",
    "# Create directory if it doesn't exist\n",
    "os.makedirs(\"./data/paul_graham/\", exist_ok=True)\n",
    "\n",
    "# Download the file using requests\n",
    "url = \"https://raw.githubusercontent.com/run-llama/llama_index/main/docs/examples/data/paul_graham/paul_graham_essay.txt\"\n",
    "response = requests.get(url)\n",
    "with open(\"./data/paul_graham/paul_graham_essay.txt\", \"wb\") as f:\n",
    "    f.write(response.content)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Using LlamaIndex\n",
    "\n",
    "### Configuring the Opik integration\n",
    "\n",
    "You can use the Opik callback directly by calling:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.core import set_global_handler\n",
    "\n",
    "# You should provide your OPIK API key and Workspace using the following environment variables:\n",
    "# OPIK_API_KEY, OPIK_WORKSPACE\n",
    "set_global_handler(\n",
    "    \"opik\",\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now that the callback handler is configured, all traces will automatically be logged to Opik.\n",
    "\n",
    "### Using LLamaIndex\n",
    "\n",
    "The first step is to load the data into LlamaIndex. We will use the `SimpleDirectoryReader` to load the data from the `data/paul_graham` directory. We will also create the vector store to index all the loaded documents."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.core import VectorStoreIndex, SimpleDirectoryReader\n",
    "\n",
    "documents = SimpleDirectoryReader(\"./data/paul_graham\").load_data()\n",
    "index = VectorStoreIndex.from_documents(documents)\n",
    "query_engine = index.as_query_engine()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can now query the index using the `query_engine` object:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "response = query_engine.query(\"What did the author do growing up?\")\n",
    "print(response)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You can now go to the Opik app to see the trace:\n",
    "\n",
    "![LlamaIndex trace in Opik](https://raw.githubusercontent.com/comet-ml/opik/main/apps/opik-documentation/documentation/static/img/cookbook/llamaIndex_cookbook.png)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "py312_llm_eval",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
